

Marshall Space Flight Center
Approach in Achieving High
Reliability of the Saturn Class
Vehicles

Dr. Eberhard Rees 
Deputy Director, Technical 
Marshall Space Flight Center 
Huntsville, Alabama

4th Annual Reliability & Maintainability Conference 
Statler-Hilton Hotel
Los Angeles, California 
July 28-30, 1965 

On the subject of reliability and approaches to its achievement, I 
guess thousands of papers have been written and thousands of speeches 
and presentations have been delivered in the past ten years. As a matter 
of fact, hardly any issue in recent years has prompted as many heated
debates, carried on with almost religious fanaticism,
as the reliability question. These debates have focused mainly on reliability
philosophies and concepts, on definitions (what is reliability, after all?), and
on the purely mathematical - or better, the analytical - approach vs. common
engineering practices. In the discussions, both parties usually went to
extremes and the meetings finally ended without any tangible results. Both
parties returned to their desks, drawing boards and shops with ill feelings.

I have been in many of these meetings, siding with the practical engineer 
because we have a hardware job to do, always with a tough time schedule 
and very limited funds, with the hardware complex and at the boundary of
known technology.


Now, were these discussions really so fruitless as they seemed to be? 
On the way home after the meetings I could not get rid of some points the
guy on the other side of the fence was making -- something sank in,
and my rigid position began to soften somewhat. Of course,
a hard-core man does not admit that to outsiders - but I admitted it
to myself and to my associates. The most difficult task was to
convince our design and laboratory engineers and those of some of our 
contractors that, for instance, detail analysis of components and subsystems, 
the study of characteristic failure modes, the application of mathematical 
models, the use of logic diagrams, the establishment of relative reliability 
numbers or ranges of numbers, etc., are helpful and necessary tools
for obtaining high probabilities of mission success. Now what could 
management do to introduce these tools? 

We at Marshall set up a reliability group under the technical top 
management mainly for establishing guidelines, policies, and concepts. 
Simultaneously we established reliability groups under each Division Director 
in the various disciplines such as mechanical design, electrical design,
guidance and control, quality assurance, etc. We insisted, however, 
that these groups consist of men who had a good background in hardware 
development. At the same time we hired a reliability contractor who had 
to work directly with our engineers in Huntsville. This integration into our 
operation of contractor personnel, who could demonstrate by day-to-day hard 
work to the hardware designer - and not merely by talking philosophy -
helped greatly.




We, together with that contractor, analyzed already designed components,
and especially subsystems, and we could show where, for instance, 
redundancy would increase reliability or where other methods of operation 
of a subsystem would decrease the possibility of malfunction. We set 
up logic diagrams for component and subsystem functions which, I believe, 
are excellent tools for analyzing a situation. To the electronic engineer 
this has been a common method but, surprisingly, the mechanical design 
engineer in the past made little use of it. We arranged design reviews 
on the component level, set up preferred parts lists, qualification programs, 
etc. This instigation and penetration program of reliability consciousness 
into our operation worked out better than I had dared to hope, so we felt 
that we could dissolve the group reporting directly to our top management 
and give the job to our Quality Assurance and Reliability Laboratory for 
the whole Center. 

Of course, in accepting some of the mathematical and analytical
methods we did not abolish good old engineering principles whatsoever.
I do not believe that - at least in the earlier days of our fight for the common
engineering approach - we made too much of an impact on the theoretical
people on the other side of the fence with our arguments. Although we,
together with the prime contractor, conducted a very successful reliability
program based on sound engineering practice on the Pershing Guided Missile
for the Army, this did not come into the limelight too much. It was the big
Saturn I launch vehicle whose reliability - at that time before the first
launching - was questioned by reliability theoreticians. The first task
assigned to the reliability company which I mentioned before was to come
up with a figure of inherent reliability on the first stage of the Saturn I
launch vehicle design. Due to the engine-out capability designed into the
8-engine cluster, and due to the fact that the H-1 engine is a simplified
version of the well-proven Atlas first stage engine, the Jupiter and the
Thor engine, in combination with other factors, the mechanical part
came out quite well - not so, the electrical part. The reliability company,
based on lack of numbers for our design, had taken into its calculations
figures at that time common to electronic components, especially as to
connectors, relays, etc. We could show that the particular components
selected for the Saturn I guidance, control, wiring, power supply,
telemetry, etc., were well proven in other systems and had excellent
records, or that new parts were designed on sound engineering principles.
The final study yielded a fairly good reliability figure of somewhere close
to .766 for the one-stage version of the Saturn I system prior to first
flight. This was to the surprise of some of the more theoretically
inclined reliability people who still believed that there might be a high
probability that the vehicle would blow up at lift-off or during the early
part of its flight. As you all know, it did not, although no mathematical
reliability model was applied in its design, but mainly good engineering
principles and sober engineering judgment. We applied a good qualification
program and assessment of all components and subsystems and we had an
excellent inspection, quality control and assurance program based on the
principles which later became a salient part of the NASA NPC documents
200-1, -2, and -3.





This success proved our point to the extremists of the mathematical
model community and made a noticeable impact on them, just as we became
impressed by some of their analytical approaches which we later applied
extensively to the Saturn I 2-stager.

In telling you, maybe somewhat too elaborately, this story of the
past, I just wanted you to know we went through the same reliability struggle
that I feel you all have had, more or less, in your own plants or program offices.

We at Marshall Space Flight Center, together with our prime contractors 
in the Saturn launch vehicles for the Apollo program, have now arrived at 
a rather clear concept on how to achieve high reliability, or rather, 
on how to obtain high probability of mission success. It is, in one sentence, 
the application of sound and knowledgeable engineering and engineering 
judgment - let me repeat the word "engineering judgment" - based on 
long-range experience and supported by all the analytical tools I have 
mentioned before, such as detailed analysis of each component and subsystem,
logic diagrams, mathematical models, etc., and then, most
important, an exhaustive qualification test program, system test program,
and a quality assurance program according to NASA manuals NPC 200-1, -2, 
-3 and NPC 250-1 for reliability. Based on Marshall's general guidelines 
and the mentioned documents which are spelled out in the contract, each 
prime contractor establishes his own reliability program. We discuss 
these programs in detail with him but we do not, and should not, insist 
that each contractor plan and execute it in exactly the same standard way.
We feel that the contractor should have room for his own initiative and
imagination. 

With this general approach we were able to demonstrate nine 
successful mission completions of the Saturn I launch vehicle out of nine 
launchings. Based on this record I am confident that the 10th launching, 
which happens to be tomorrow, will also be successful. This then concludes 
the Saturn I program with ten successes out of ten - I hope. Of course, 
this does not mean that the Saturn I launch vehicle now has a proven 
100% mission reliability in the statistical meaning of the word. Due to 
constraints of time and funding, we were not able to run statistical
reliability tests of all components in the quantities needed to establish
really meaningful reliability figures, but we can make the statement that -
considering all these program constraints - everything feasible was done
to make the Saturn I as highly reliable as possible and we would have had
utmost confidence that each manned space flight mission flown on this
launch vehicle would have been successful.
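
The remark that ten successes do not amount to a proven 100% reliability "in the statistical meaning of the word" can be illustrated with a short calculation. The sketch below is an added illustration, not a method described in the speech. It assumes independent launches with a constant success probability and uses the standard zero-failure binomial bound; the function name and the confidence levels are illustrative choices.

```python
def zero_failure_lower_bound(n_successes: int, confidence: float) -> float:
    """One-sided lower confidence bound on reliability after n successes, no failures.

    Assumes independent trials with a constant success probability R.
    The bound is the R for which n straight successes would occur with
    probability (1 - confidence), i.e. R**n = 1 - confidence.
    """
    if n_successes <= 0:
        raise ValueError("need at least one trial")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return (1.0 - confidence) ** (1.0 / n_successes)


if __name__ == "__main__":
    # Ten launches, ten successes, evaluated at a few illustrative confidence levels.
    for conf in (0.50, 0.90, 0.95):
        bound = zero_failure_lower_bound(10, conf)
        print(f"10 of 10 successes, {conf:.0%} confidence: R >= {bound:.3f}")
```

Under these assumptions, ten successes in ten launches support a reliability of only about 0.79 at 90% confidence, which is why the flight record alone cannot stand in for a statistical proof.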

During the execution of the Saturn I program we had - among
minor difficulties - two salient occurrences which I think are worthwhile
to recall briefly in the context of this reliability conference - one on
engine-out and the other on stress corrosion.

In order to test whether the engine-out scheme on the 8-engine cluster 
of the first stage would work, we, in one of the earlier flights, deliberately 
cut off one engine. It worked well, and we hoped that this scheme would never have
to be enacted. However, due to a malfunctioning gear box of an older
design version, we lost one engine during the sixth flight of Saturn I.
If I remember correctly, it was after about 90 seconds or so of flight.
We would have lost the whole mission had the sensors and the cut-off
of this particular engine not worked properly and the guidance system not
corrected the deviation from the nominal trajectory caused by the
loss of this engine. We actually obtained the proper orbit. In this case the
inherent reliability designed into the system saved the mission although
a weak component had been flown.
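
The benefit of engine-out capability can be sketched as a k-out-of-n calculation. This is an added illustration under simplifying assumptions, not the analysis used on Saturn I: engines are treated as independent and equally reliable, a shutdown is assumed to be detected and handled correctly, and the per-engine figure is a hypothetical value chosen only to show the effect.

```python
from math import comb


def k_of_n_reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent units work, each with reliability p."""
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Sum the binomial probabilities for exactly i working units, i = k..n.
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    p_engine = 0.98  # hypothetical per-engine reliability, not a Saturn I figure
    no_engine_out = k_of_n_reliability(8, 8, p_engine)    # all 8 engines must run
    with_engine_out = k_of_n_reliability(7, 8, p_engine)  # one shutdown tolerated
    print(f"all 8 engines required: {no_engine_out:.4f}")
    print(f"7 of 8 engines suffice: {with_engine_out:.4f}")
```

The comparison shows why tolerating a single engine loss raises the cluster figure well above the product of eight individual engine reliabilities. It also makes plain that the gain depends on the sensing and cut-off logic working, as it did on the sixth flight.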

The other case: During a pressure test at Cape Kennedy a crack in the
LOX dome on one engine of the first stage became apparent. An investigation
revealed that this was a case of stress corrosion, and we found beginnings
of stress corrosion also in the LOX domes of some of the other engines.
We pulled all the engines and retrofitted them with new LOX domes
from forgings of a more stress corrosion resistant material. This example
shows that no reliability approach or concept could have helped us, because
when this engine was designed the material with the best stress corrosion
resistance known at that time was selected and the forging, machining
and heat treatment process was well controlled. In the meantime, progress
in the technological state of the art as to stress corrosion was made.

In both cases we and the contractor knew the weaknesses of these 
components but, due to time pressure and long lead times involved, 
we decided to take the seemingly small risk to fly them in the unmanned
version of the Saturn I because many of these types of gear boxes and
many of these types of LOX domes had been tested and flown successfully 
before. 

With the experience and knowledge gained by Marshall Space Flight 
Center and its contractor team, and with the application of many of the 
same or similar parts, components, subsystems and systems from the 
Saturn I program, we are going into the Saturn IB and Saturn V programs 
with greater confidence. 

I don't think it is necessary to describe to you the technical 
features of these launch vehicles, how they function, and how they will 
be used. I presume this audience knows all about this. Let me point 
out, however, that the planning for these two launch vehicle classes of 
the Apollo program is markedly different from every other manned 
space flight program and from the Saturn I. The Redstone Missile and the 
Atlas Missile together had close to 200 flights before their use as boosters 
on the Mercury program. Also, the Titan II had a long record of successful 
test flights as a weapon system before the first manned Gemini mission was 
flown. 

The time schedule for the manned lunar landing mission does not 
allow such a number of test flights for the development of the Saturn IB and 
Saturn V. In addition, the high cost per launching of the huge Saturn V is 
prohibitive. On the other hand, the reliability goals for these two vehicles 
serving the extremely complex manned lunar mission have to be as close to 
the figure "one" as possible. Moreover, we have to shoot for these goals
early, starting with the very first flight.
The few unmanned launchings allowed prior to the manned ones each carry
important spacecraft missions in addition to the launch vehicle development 
missions. In other words, every launching has to be successful in all 
aspects. We cannot permit ourselves the luxury of learning and gaining 
experience and knowledge by failure during lift-off and flight. It is even 
mandatory that preparations for flight and countdown proceed as flawlessly 
as possible. If you add to these requirements the complexity and size, 
especially of the Saturn V, you will realize the enormous challenge of this 
program with regard to the reliability problem. 

Right from the outset it can be stated that a classical reliability 
test program resulting in a statistical reliability figure and the proof of 
numerical reliability goals by testing is not in the cards for all components 
and systems. We will, however, conduct such test programs wherever 
feasible, especially in the area of small components. Confidence in larger 
components and their proper function under prevailing environmental 
conditions will be established by a thorough qualification test program and 
by testing them within the system. Considering the constraints we have 
to live under in the program, I do not believe that we can ever prove an 
established numerical reliability goal. 
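
A rough calculation shows why proof by testing is out of reach for large hardware. The sketch below is an added illustration, not part of the Marshall program. It assumes independent, identical trials and asks how many consecutive failure-free trials would be needed to demonstrate a given reliability goal at a given confidence; the function name and the example goals are hypothetical.

```python
import math


def failure_free_trials_needed(reliability_goal: float, confidence: float) -> int:
    """Smallest n such that n straight successes demonstrate the goal at the confidence.

    Solves reliability_goal**n <= 1 - confidence for n, assuming
    independent trials with a constant success probability.
    """
    if not 0.0 < reliability_goal < 1.0:
        raise ValueError("reliability_goal must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability_goal))


if __name__ == "__main__":
    for goal in (0.95, 0.99):
        n = failure_free_trials_needed(goal, 0.90)
        print(f"goal {goal}: {n} failure-free trials at 90% confidence")
```

Under these assumptions a goal of 0.95 needs 45 failure-free trials at 90% confidence and a goal of 0.99 needs 230. Numbers of that size are plausible for small components but not for complete stages or vehicles.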

In the following I will briefly touch on the various hardware
groups and systems and give a few examples of our approach to
achieving high reliability:





1. Launch vehicle stages


For the development of each stage we have in principle established
five full-size ground test stages in the program; namely, a stage for structural
testing, a stage for dynamic testing, a stage for checkout of the launch
facilities at Cape Kennedy as to their compatibility with the launch vehicle,
a battleship stage for early hot testing of the main propulsion system, and an
all-systems stage. This latter stage is for hot static testing of the
configuration of the stages for the first flight and is then used for continuing
development, engineering effort, and improvement of components and
subsystems in the environment of the overall system. This stage is most
important for establishing confidence in the hardware and its proper function.
The test activity will continue throughout the program. In the S-II stage we
saw a particular need for structural testing under cryogenic conditions.
For this, an additional stage of full-size diameter but shorter length was
built.

2. Instrument Unit 

This unit, which is a 3-foot high ring sitting on top of the stages, is
the brain of the launch vehicle. It contains guidance and control equipment,
electrical power supply and distribution, telemetry and tracking equipment,
etc. Here we have a similar approach as in the stages; namely, the
manufacturing of a number of full-size ground test units, some of them
fully equipped - some of them partially. Particular emphasis is given to
structural testing, vibration testing, systems testing, compatibility testing
with the S-IVB stage, and especially with the ground equipment for automatic
checkout.


3. Overall launch vehicle


To determine the dynamic behavior of the total configuration during
flight we mount the entire full-size space vehicle, including spacecraft
dummies, in a dynamic test tower and expose it to vibration. The dynamic
test stages and ground test instrument units are used in this test. It
yields the natural frequencies, the bending modes, control system responses,
etc.

The facility stages and a test instrument unit, assembled to make 
up a complete Saturn V, are primarily used for checkout of the launch 
facility, as I mentioned before. This is an essential step for establishing 
compatibility of launch vehicle and launching site. There shortcomings 
and weaknesses in design on vehicle connections and systems, as well as 
on mechanical and electrical ground support equipment, will be revealed. 
Planned operational procedures, for instance, for propellant loading and
handling of the vehicle, will be exercised, reviewed, and changed. We call
this facility vehicle the "Wet Test Bird."

4. Automatic checkout equipment 

Within the time allotted to me for this presentation it is not possible 
to explain to you the scheme and concept of how the automatic checkout of 
stages and instrument unit works and what equipment is involved. However, 
since this checkout method, among other advantages, eliminates to a 
great extent the human error, I think it should be mentioned in this 
presentation. The prime contractors, in their plant checkout before captive 
acceptance testing and in their final checkout before shipping, are using this
automatic equipment which is compatible with the checkout equipment at
Cape Kennedy. In order to guarantee this, to develop the system, to 
make it compatible with the instrument unit, to test the hardware of the 
system, to work out overall checkout procedures, to train operators and 
for various other purposes, especially in the field of systems integration, 
a systems development facility for this automatic checkout equipment of the 
Saturn IB and V is at present being built up at Huntsville.

In the mechanical field of ground equipment I want to mention one 
particular area which can cause trouble and lead to catastrophic failures. 
These are the mechanical connections between the umbilical tower at the 
launching site and the vehicle, the so-called swing arms. Again, it 
would go too far into detail to describe the functions of each swing arm 
and the environmental conditions under which it has to work. We have at
Huntsville a swing arm test facility where all swing arms will be tested 
under various wind conditions and under cryogenic conditions as to their 
qualification, timing of their retraction, disengagement from the vehicle, etc. 

It is hard to express in reliability numbers the influence of the 
testing on both facilities but we believe that it will increase the confidence 
in reliable function of these components remarkably. 

5. Components

In the field of components we adhere strongly to NASA documents
NPC 200-1, -2, and -3 and NPC 250-1. Marshall Space Flight Center, as
I mentioned before, was instrumental in setting up these documents. The
concepts spelled out in them are, in my opinion, some of the keystones
for achieving high reliability. I would like to direct your attention to
two salient points in those documents which I personally believe are,
among others, very important. These are: in-process inspection and the
application of preferred parts lists.

A 100% in-process inspection, if thoroughly conducted everywhere,
especially in the plants of our subvendors, is quite expensive. However,
I feel it should be applied on all critical components.

When detail design begins, we require that the parts used in the
design be selected from the Marshall Space Flight Center Preferred Parts
List, PPD-600, or MIL-STD-143, in that order of preference. This
serves two purposes: standardization across the total launch vehicle,
with the resulting reduction in qualification testing required, and the
assurance that only parts with known reliability histories are used. Of
course, these lists have to be kept up-to-date.

In the area of mechanical components, our main trouble makers are
still cryogenic valves, pressure switches, reducers, long cryogenic lines
of large diameters, expansion joints, pipe connections - the ever-present
leakage problem - and welding problems of high aluminum alloys involving
large diameters and rather high precision. We try to overcome these
difficulties by almost daily exchange of experience between us and the
contractors and their subs. Marshall Space Flight Center devotes considerable
in-house effort to the design of these components, to qualification testing
and to backup solutions. As to the welding problem, for instance, we carry
out in our shops, in cooperation with our material experts and contractors,
developments of welding methods and conduct training of welders.
In this connection, we also develop non-destructive inspection and test methods.
It may be of interest to you to know that we have a training program on
non-destructive test methods going on for NASA as a whole, executed
by a contractor.

As to the electrical components, we have made extensive use of 
triple redundancy with voting circuits which gives us a very high inherent 
reliability for the guidance and control system. The only place where 
such a redundancy scheme is not feasible is the stabilized platform in the 
Instrument Unit of the launch vehicle. Therefore, we have introduced 
redundancy for the stabilized platform by using the IMU (Inertial Measuring 
Unit) in the Apollo spacecraft command module as a backup. In case the 
stabilized platform, for instance, of the Saturn V launch vehicle fails, 
the IMU can take over from ignition of the S-II stage on through the whole 
launch vehicle operation. 
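
The gain from triple redundancy with voting can be sketched in a few lines. This is an added illustration under stated assumptions, not a description of the Saturn circuits: the three channels are taken as independent and equally reliable, the voter itself is treated as perfect, and both function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(channels: Sequence[Hashable]) -> Hashable:
    """Return the value reported by a strict majority of channels; raise if there is none."""
    value, count = Counter(channels).most_common(1)[0]
    if count * 2 <= len(channels):
        raise ValueError("no majority among channels")
    return value


def tmr_reliability(p: float) -> float:
    """Reliability of a 2-out-of-3 voted arrangement of independent channels.

    The arrangement works if all three channels work (p**3) or if
    exactly two work (3 * p**2 * (1 - p)). The voter is assumed perfect.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return p**3 + 3.0 * p**2 * (1.0 - p)


if __name__ == "__main__":
    # One channel disagrees; the voter masks the single fault.
    print(majority_vote([1, 1, 0]))
    # Hypothetical channel reliability of 0.9, chosen only for illustration.
    print(round(tmr_reliability(0.9), 3))
```

With a hypothetical channel reliability of 0.9 the voted arrangement reaches 0.972. The improvement holds only when each channel is already better than 0.5, which is one reason sound component selection still matters in a redundant design.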

Before we give the go-ahead for the first flights of these big 
launch vehicles, and after a thorough design analysis of the components, 
we conduct a qualification test program for the critical components under 
all critical flight environments, which we try to simulate as closely as 
possible. We consider this test program most important and from it 
we expect to arrive at a proper confidence level for success. Since, 
however, the real function and quality of all components, subsystems, and 
systems can only be proven by exposing them to flight conditions, we equip 
the first flights with ample measuring devices. This is especially 
important since we have only a few unmanned development flights from
which we have to obtain as much knowledge as possible about the behavior
and function of the vehicle and the real environment during all flight phases. 

In our flight measuring program on Saturn I we telemetered to the 
ground around 1000 measurements, of which we lost only 2 - 3%, totally or
partially. In the first Saturn V flights we plan to equip the launch vehicle
with over 2000 measurements, which ought to give us all necessary
information to judge the quality of the vehicle and its components. If 
something goes wrong or some components function out of specified tolerances, 
we have to know this. Malfunctions have to be explained and then corrected. 
Otherwise we cannot dare to go into manned flights. 

For the Saturn V we have allocated the following preliminary reliability 
predictions or goals: 


    First stage        S-IC     =  .95
    Second stage       S-II     =  .95
    Third stage        S-IVB    =  .95
    Instrument Unit    IU       =  .992
    GSE                         =  .95



These tentative figures are based on test data from known components, on
assessments of less known components, on extended test programs for
engines, on calculated criticality numbers, and finally, on engineering
judgment. They can be debated and, of course, criticized. For instance,
the S-IVB stage has to be reignited in earth orbit and has to have proper
ullage and attitude control over a longer period. It, therefore, appears that
it ought to have theoretically a lower reliability than the other stages. On
the other hand, it has only one main engine compared with five each for
the other stages.
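
To see what these allocations imply for the vehicle as a whole, one can multiply them. The sketch below is an added illustration: it assumes that every element must work and that failures are independent (a simple series model), which the speech does not state as the method used. The dictionary layout and function name are hypothetical.

```python
from math import prod

# Preliminary Saturn V allocations quoted in the text above.
ALLOCATIONS = {
    "S-IC": 0.95,
    "S-II": 0.95,
    "S-IVB": 0.95,
    "IU": 0.992,
    "GSE": 0.95,
}


def series_reliability(values) -> float:
    """Reliability of elements that must all work, assuming independent failures."""
    return prod(values)


if __name__ == "__main__":
    # Flight hardware only: three stages plus the Instrument Unit.
    flight_only = series_reliability(
        value for name, value in ALLOCATIONS.items() if name != "GSE"
    )
    # Flight hardware together with the ground support equipment.
    overall = series_reliability(ALLOCATIONS.values())
    print(f"stages + IU:       {flight_only:.3f}")
    print(f"stages + IU + GSE: {overall:.3f}")
```

Under the series assumption the five allocations multiply to roughly 0.81, and to roughly 0.85 without the ground support equipment. Even high individual goals therefore leave a noticeably lower figure for the complete system.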

Sometime in the program there will be the crucial moment when 
management has to decide which vehicle will be assigned the first manned 
lunar landing mission. This will be done after a series of thorough and 
detailed reviews during which all previous results of ground testing, 
analyses, qualification surveys, quality assessments, single point failure 
analyses, etc., will be put on the table and scrutinized. Checkout procedures,
results of countdowns of previous launches, and all other operational
procedures will be reviewed. Most important, of course, are the flight
results of previous unmanned vehicles. Finally, with all this material
on hand and after thorough discussion with contractor and government
personnel, top program management will have to make a judgment
whether the confidence of all leading participants is high enough and whether
everything humanly possible has been done to make the mission a success.
Although reliability and criticality figures will be strongly considered in
coming to a conclusion, I do not think that the rationale will be, or can
possibly be:

"Since a predicted reliability figure, of say .835 or .877 or similar
for the whole launch vehicle system, has been reached, we are now ready
to give the go-ahead," or conversely, "Since such a figure cannot be
proven, we are not ready." In the final analysis it will be the engineering
and management judgment of a few responsible top people. It is the task of
the contractor-government engineering and program management team to
have the proper material prepared, enabling top management to make this
decision.

Let me, at the end of my presentation, offer a few observations
which I have made over long years on guided missiles and space launch
vehicles, and which I think are worth noting in connection with
the reliability effort:

1. The inherent reliability of a launch vehicle is only as good as its 
design and engineering. This is very often conducted by personnel in the 
lower ranks and younger, freshly hired people with little experience. The 
excellent and outstanding people have left the drawing board and laboratory 
long ago and have become managers or salesmen. Mostly, they do not 
involve themselves any more in technical details. I believe this is to the 
detriment of this kind of complex technical program and to the reliability 
problem as such. I believe the middle management - even up to the top 
level - should engage itself constantly, and not only in emergencies, in 
detail design reviews, detail test planning, detail supervision of results, etc. 
Excellent designers and engineers should get incentives to stay on the 
drawing board, in active testing and in laboratories. I am sure we would 
then come up with more reliable designs. I feel the good old designer with 
novel ideas who lives with his task and is proud of his accomplishment has 
become a figure of the past. He is replaced more and more by reliability 
philosophies, computers, management experts, salesmen, representatives, 
etc. We ought to educate hard core designers and engineers again. 




2. The technical difficulties in all these programs are found,
almost without exception, in subsystems and components. They are
mainly procured from subvendors who often have difficulties keeping
their small shops financially out of trouble. They need the job badly and
therefore bid low. These subcontractors should be more carefully
selected, especially as to their technical capability. After award of 
contract and right from the outset they should be more closely monitored 
and more often visited by the leading design, test, and quality engineers 
of prime contractors and Government. Stronger day-to-day technical
penetration of subvendors by Government and prime contractor personnel 
is a "must" because these subvendors in their struggle to meet schedules 
and costs, being under fixed price contract, resort frequently to shortcuts, 
sloppy work, fixes, neglect of proper cleaning and packaging for shipment, 
deletion of tests, and the like. This is to the detriment of reliability. 

3. When funds have to be reduced, the last thing which should be done 
is to cut down on test facilities and equipment and on the planned volume of 
testing. Elimination of inspection and quality control procedures is equally 
dangerous. We at Marshall do not believe in so-called "minimum test 
programs" because it is always debatable what this minimum is. I hate
to see money saved, for instance, from development testing in order to
buy immature and unqualified long lead time hardware in great quantities.

4. Don't, on the other hand, over-inspect and over-test flight
hardware. This is equally hostile to reliability and reduces the lifetime of
components, for instance, through too much static firing of flight stages or engines.





5. If, during testing or checkout of a component or a stage,
there occurs an unexpected and irregular function which does not show up
any more in a repeat test, don't let it pass. The cause should, by all
means, be investigated and then corrected and documented, if necessary.
I have seen it happen quite often that during a certain step of a checkout a
red light would flash. The test was repeated. When red did not appear
again, everybody was happy. The checkout proceeded and was finally
declared successfully accomplished. This runs counter to a sound checkout
and test concept.

6. Although I feel we all, together with our contractor family,
constitute a pretty good team in the Apollo program, I still believe - and
I think this goes for other programs, too - that we can improve in
communication as to:

Exchange and acceptance of experience.

Collecting and transfer of test data.

Frank submission of failure and unsatisfactory condition reports.

Closing the loop by reporting corrective action.

Uninhibited admission and reporting of shortcomings and weaknesses
in components and systems.

Data and experience exchange with other programs and agencies.


There are many more such items in the line of communication
where improvement ought to be achieved.

I honestly believe that this last point which I have been trying to 
make represents the most important factor in achieving high reliability 
in such a program as the Apollo. 





